Neural Acoustic-Phonetic Approach for Speaker Verification With Phonetic Attention Mask

Abstract

The traditional acoustic-phonetic approach makes use of both spectral and phonetic information when comparing the voices of speakers. While phonetic units are not equally informative, the phonetic context of speech plays an important role in speaker verification (SV). In this paper, we propose a neural acoustic-phonetic approach that learns to dynamically assign differentiated weights to spectral features for SV. Such differentiated weights form a phonetic attention mask (PAM). The neural acoustic-phonetic framework consists of two training pipelines, one for SV and another for speech recognition. Through the PAM, we leverage phonetic information for SV. We evaluate the proposed neural acoustic-phonetic framework on the RSR2015 database Part III corpus, which consists of random digit strings. Experiments show that the proposed framework with PAM consistently outperforms the baseline, with an equal error rate reduction of 13.45% and 10.20% for female and male data, respectively.
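To make the idea of a mask that assigns differentiated weights to spectral features concrete, the following is a minimal sketch and not the authors' architecture. It assumes the mask is obtained by passing per-frame scores through a sigmoid and that the weighted frames are then mean-pooled over time; the function name `apply_phonetic_mask`, the `mask_logits` input, and the pooling step are all hypothetical choices made for illustration. In the paper the mask is derived from a speech recognition pipeline, whereas here the scores are simply given as input.

```python
import math


def sigmoid(x):
    """Map a real-valued score to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_phonetic_mask(spectral_frames, mask_logits):
    """Weight spectral frames by a mask and average them over time.

    spectral_frames: list of frames, each a list of feature values.
    mask_logits: scores of the same shape (assumed here to come from a
        phonetic model); a sigmoid turns them into mask weights.
    Returns one pooled feature vector (a stand-in for an utterance-level
    representation; this pooling is an assumption, not the paper's method).
    """
    if len(spectral_frames) != len(mask_logits):
        raise ValueError("frames and mask must have the same length")
    num_frames = len(spectral_frames)
    dim = len(spectral_frames[0])
    pooled = [0.0] * dim
    for frame, logits in zip(spectral_frames, mask_logits):
        for d in range(dim):
            # Element-wise weighting: informative regions get weights near 1.
            pooled[d] += sigmoid(logits[d]) * frame[d]
    return [value / num_frames for value in pooled]


# Two frames with two features each; zero logits give a uniform mask of 0.5.
frames = [[2.0, 4.0], [6.0, 8.0]]
uniform = apply_phonetic_mask(frames, [[0.0, 0.0], [0.0, 0.0]])
```

With all scores at zero every feature is halved before pooling; larger scores for particular frames would let those frames dominate the pooled vector, which is the sense in which the weights are differentiated.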

Related articles

Phonetic, idiolectal and acoustic speaker recognition

This paper describes a text-independent speaker recognition system that achieves an equal error rate of less than 1% by combining phonetic, idiolect, and acoustic features. The phonetic system is a novel language-independent speaker-recognition system based on differences among speakers in dynamic realization of phonetic features (i.e., pronunciation), rather than spectral differences in voice q...

Speaker verification based on phonetic decision making

Speaker verification based on phone modelling is examined in this paper. Phone modelling is attractive, because different phonemes have different levels of usefulness for speaker recognition, and because phone modelling essentially makes a speaker verification algorithm text independent. The speaker verification system used here is based on a two stage approach, where speech recognition (segmen...

Speaker verification based on broad phonetic categories

In this work we present a speaker verification system based on 4 broad phonetic categories: vowels+diphthongs, fricatives, glides+nasals, and silence+stops. Using these categories separately, it is observed that vowels, diphthongs, and fricatives are the most important categories for speaker verification. This observation confirms the results from the analysis of speaker and channel variability...

Phonetic vocoding with speaker adaptation

This paper describes a phonetic vocoding scheme which relies on speaker adaptation to capture important speaker characteristics. These are typically lost in phonetic vocoders which transmit only information about the phones which are recognized, together with some prosodic information. In our scheme, however, additional speaker characteristics are transmitted in vowel regions (average values of...

Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification

This paper describes our recent efforts in exploring data-driven high-level features and their combination with low-level spectral features for speaker verification. In particular, we compare the phonetic and data-driven approaches and study their complementarity with the short-term acoustic approach. Our objective is to show that data-driven units automatically acquired from the speech data, can be...

Journal

Journal title: IEEE Signal Processing Letters

Year: 2022

ISSN: 1558-2361, 1070-9908

DOI: https://doi.org/10.1109/lsp.2022.3143036